Abstract - In the domain of predicting land surface fluxes,
models are used to bring data from large observation networks 
and satellite remote sensing together to make predictions about 
present and future states of the Earth. Characterizing the 
uncertainty about such predictions is a complex process and 
one that is not yet fully understood. Uncertainty exists about 
initialization, measurement and interpolation of input
variables; model parameters; model structure; and mixed 
spatial and temporal supports. Multiple models or structures 
often exist to describe the same processes. Uncertainty about 
structure is currently addressed by running an ensemble of 
different models and examining the distribution of model 
outputs. To illustrate structural uncertainty, we discuss a
multi-model ensemble experiment that we have been conducting
with the Terrestrial Observation and Prediction System (TOPS).
TOPS uses public versions of process-based
ecosystem models that use satellite-derived inputs along with 
surface climate data and land surface characterization to 
produce predictions of ecosystem fluxes including gross and 
net primary production and net ecosystem exchange. Using the 
TOPS framework, we have explored the uncertainty arising 
from the application of models with different assumptions, 
structures, parameters, and variable definitions. With a small 
number of models, this only begins to capture the range of 
possible spatial fields of ecosystem fluxes. Few attempts have 
been made to systematically address the components of 
uncertainty in such a framework. We discuss the 
characterization of uncertainty for this approach including 
both quantifiable and poorly known aspects. 
I. Introduction

A. Predicting land surface fluxes spatially 

The rising concentration of CO2 (and other greenhouse
gases) in the atmosphere, driven largely by anthropogenic
carbon emissions, represents a major threat to the Earth’s
climate. Yet the growth of atmospheric CO2 does not equal
emissions, because about half of the emitted CO2 is sequestered
by natural reservoirs in the ocean and on land (IPCC, 2007).
Between 2000 and 2006, for instance, global anthropogenic
carbon emissions were about 9.1 PgC yr-1 (petagrams of
carbon per year), while carbon uptake by the ocean and land
was 2.2 PgC yr-1 and 2.8 PgC yr-1, respectively (Canadell et
al., 2007). In North America, terrestrial ecosystems alone
sequestered 0.65 PgC yr-1 during the same period, offsetting
one-third of the continent’s carbon emissions from fossil fuel
burning and cement manufacturing (Peters et al., 2007).
However, these natural carbon sinks are limited and may
diminish in the future (Canadell et al., 2007). Understanding
their properties and how they will change over time remains a
challenge for the carbon cycle science community (NACP,
2002).

Ecosystem models provide the primary method for 
mapping regional to global terrestrial carbon fluxes from 
vegetation. They integrate the understanding of ecological 
processes obtained from local measurements and apply this 
knowledge to simulate ecosystem functions over broader 
regions. Since the 1980s, they have been applied on a spatial
basis to create maps of predicted output variables. They are
now critical tools for understanding the fate of carbon in the
atmosphere.

B. Characterizing the scientific uncertainty of predicted surface fluxes

Recognition is growing that results generated from 
ecosystem flux models must be accompanied by a 
quantification of scientific uncertainty. Model uncertainty
differs from measurement uncertainty. For measurements,
replicates can be obtained under constant conditions to
quantify precision, and comparisons can be made to a reference
or standard to quantify bias. Models, by contrast, can be as
precise as computational methods allow, but references or
standards are difficult to obtain, which is the reason that
models are needed in the first place. Bias is therefore more
difficult to quantify. A thorough understanding of the reasons
that model results are uncertain should lead to methods that
express a more complete distribution of probable output
values.

II. Why predictions are uncertain 

A. Initialization, measurement and interpolation of input variables

Though the primary interest is in uncertainty about 
output variables, models require maps of input variables that 
are themselves uncertain. Several inputs to models that
describe ecosystem processes (e.g., air temperature,
precipitation, soil texture) are measured at quasi-point
support. There is little or no sampling design involved in
these measurements. Even meteorological networks, which
provide some of the densest measurements across the
domain of interest, represent a minuscule sampling fraction,
have pervasive spatial clustering, and include an unknown
bias. To supply a model with values at every grid cell for a
simulation, values of these variables are interpolated across
vast areas of unmeasured territory. These interpolated fields
then drive the models, so the resulting simulations carry the
interpolation uncertainty.
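
The effect of sparse, clustered stations on a gridded value can be
illustrated with a minimal sketch. It uses inverse-distance weighting
as a simple stand-in interpolator; this choice is an assumption made
for illustration and is not the method of any system named in this
paper. The function name idw_interpolate, the planar coordinates, and
the station values are all hypothetical.

```python
import math


def idw_interpolate(stations, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) station tuples.

    Inverse-distance weighting is an illustrative assumption here,
    not the interpolation method of any particular system.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a station: return its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical daily temperatures (deg C): three clustered stations
# and one distant station, on an arbitrary planar grid (km).
stations = [(0.0, 0.0, 12.0), (1.0, 0.0, 12.5), (0.0, 1.0, 11.5),
            (100.0, 100.0, 20.0)]
estimate = idw_interpolate(stations, (50.0, 50.0))
```

In this toy case the target cell lies midway between the cluster and
the lone station, yet the estimate is pulled toward the cluster
because the cluster contributes three nearly identical weights. The
interpolated value therefore depends on where stations happen to be,
not only on what they measured.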

Before simulating for the time period of interest, 
initializing or “spinup” runs are usually required to bring 
state variables into equilibrium with climate and ancillary 
datasets. Different models may have different spinup 
algorithms (e.g., Sitch et al., 2003; Thornton and 
Rosenbloom, 2005), leading to another source of variation in 
simulation results. 
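
The sensitivity of the initial state to the spinup procedure can be
seen even in a very small sketch. The single-pool carbon model below,
its parameter values, and the convergence test are assumptions made
for illustration; they do not reproduce the spinup algorithm of any
model cited here.

```python
def spin_up(annual_input, turnover_rate, tolerance=1e-6,
            max_years=100000):
    """Iterate a one-pool carbon model until the pool stops changing.

    Hypothetical model: pool(t+1) = pool(t) + input - rate * pool(t).
    Returns the final pool size and the number of years simulated.
    """
    pool = 0.0
    for year in range(1, max_years + 1):
        new_pool = pool + annual_input - turnover_rate * pool
        if abs(new_pool - pool) < tolerance:
            return new_pool, year
        pool = new_pool
    raise RuntimeError("spin-up did not converge")


# Same forcing, two stopping rules (all values hypothetical).
strict_pool, strict_years = spin_up(500.0, 0.02, tolerance=1e-6)
loose_pool, loose_years = spin_up(500.0, 0.02, tolerance=1.0)
```

The analytical equilibrium of this toy model is input divided by
turnover rate. The strict stopping rule approaches it closely, while
the loose rule stops earlier with a smaller pool, so two otherwise
identical runs would begin the period of interest from different
states.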

B. Model parameters 

Parametric uncertainty has long been acknowledged in
modeling exercises. Sensitivity analyses (Saltelli et al., 2004)
address this aspect of uncertainty. Notable examples in the
domain include Knorr and Heimann (2001).
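
As a minimal sketch of the idea, the following perturbs one parameter
at a time and records the relative change in output. This is a local
screening approach chosen for brevity, not the global methods treated
in the sensitivity analysis literature, and the toy model toy_npp, its
parameter names, and its baseline values are hypothetical.

```python
def toy_npp(params):
    """Hypothetical model: NPP = LUE * APAR * (1 - respired fraction)."""
    return (params["lue"] * params["apar"]
            * (1.0 - params["resp_fraction"]))


def one_at_a_time(model, baseline, relative_step=0.1):
    """Relative output change when each parameter is raised in turn."""
    base_output = model(baseline)
    changes = {}
    for name in baseline:
        perturbed = dict(baseline)
        perturbed[name] = baseline[name] * (1.0 + relative_step)
        changes[name] = (model(perturbed) - base_output) / base_output
    return changes


# Baseline values are illustrative only.
baseline = {"lue": 1.0, "apar": 1000.0, "resp_fraction": 0.5}
sensitivity = one_at_a_time(toy_npp, baseline)
```

A one-at-a-time design cannot detect interactions between parameters,
which is one reason more complete analyses vary several parameters
together.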

C. Model structure 

Structural uncertainty arises from different 
representations of ecological processes in different models. 
Because the components of terrestrial ecosystems and the 
interactions among them are complicated or not well 
understood, simplifying assumptions must be made to 
describe them. Different modeling strategies adopt different 
simplifying assumptions, leading to different model 
complexity and behavior. 

For a particular aspect of ecosystem function, structural
differences among ecosystem models may be examined by
directly comparing their mathematical formulations (e.g.,
Adams et al., 2004). However, characteristics of isolated
model components do not fully reflect their functional
behavior within the coupled system, where feedbacks and
interactions between subsystems can play a critical role.
Mathematical analysis of complicated systems can only go
so far, so numerical experiments with multi-model ensembles
(MMEs) have become the main means to tackle the problem.
In general, an MME experiment runs a group of models with 
the same input data and under the same initial conditions; the 
multi-model means or medians are treated as the “best” 
simulation results, and the inter-model differences are used 
as a measure of the structural uncertainty (Tebaldi and 
Knutti, 2007). Following the successful example of the 
World Climate Research Programme’s Coupled Model 
Intercomparison Project for the Intergovernmental Panel on
Climate Change (IPCC), MME experiments are now broadly 
adopted in carbon-cycle studies, such as the Atmospheric 
Tracer Transport Model Intercomparison Project (Rayner 
and Law, 1995; Denning et al., 1999; Gurney, 2004). Despite 
the general agreement in the simulation results and the 
implied scientific significance (e.g., see Schimel et al., 
2007), large differences among the models are also revealed, 
for instance, in estimates of contemporary global annual NPP 
(39.9-80.5 PgC yr-1; Cramer et al., 1999), or in the 
sensitivity of carbon storage to future climate change in the 
US (-39% to +40%; VEMAP, 1995).
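
The ensemble summary described above can be sketched per grid cell as
follows. The function name ensemble_summary, the model labels, and the
flux values are hypothetical, and the max-minus-min spread is only one
possible measure of inter-model difference.

```python
import statistics


def ensemble_summary(model_fields):
    """Summarize an ensemble cell by cell.

    `model_fields` maps a model label to a list of values, one per
    grid cell, with all models on the same grid.
    """
    fields = list(model_fields.values())
    n_cells = len(fields[0])
    if any(len(field) != n_cells for field in fields):
        raise ValueError("all models must cover the same grid cells")
    summary = []
    for cell_values in zip(*fields):
        summary.append({
            "mean": statistics.mean(cell_values),
            "median": statistics.median(cell_values),
            "spread": max(cell_values) - min(cell_values),
        })
    return summary


# Hypothetical annual fluxes for two grid cells from four models.
fields = {
    "model_a": [1000.0, 400.0],
    "model_b": [1200.0, 900.0],
    "model_c": [1100.0, 500.0],
    "model_d": [1300.0, 600.0],
}
summary = ensemble_summary(fields)
```

In the second cell a single outlying model moves the mean away from
the median, which illustrates why the choice between the two matters
for small ensembles.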


D. Spatial and temporal support 

Ecosystem models are not necessarily conceived with 
specific spatial or temporal supports in mind. That is, the 
notion of spatial unit size or duration often does not feature 
explicitly in model construction. Therefore, the only 
technical difference between running a model on a 1 km, 8 
km, or 1 degree support may be the number of compute 
cycles needed. Model structural elements may remain
unchanged, yet there is an implicit change in assumptions about
which regions or periods have stationary parameters. For
example, light use efficiency (LUE), a critical factor in many 
diagnostic models, may be assumed to be stationary across 
an ecoregion or across a small grid cell. These assumptions 
lead to problems in calibration and validation, as the 
spatial/temporal unit modeled may be much larger/longer or 
much smaller/shorter than the spatial/temporal unit 
measured. Raupach et al. (2005) refer to this as the “scale
mismatch problem.” 
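
One consequence of changing support can be shown with a minimal
sketch: when a model response is nonlinear, running the model on a
coarse-cell average input does not give the average of the fine-cell
outputs. The saturating light response, its parameter values, and the
sub-cell inputs below are assumptions made for illustration and are
not taken from any model in this paper.

```python
def light_response(par, gpp_max=10.0, half_saturation=500.0):
    """Hypothetical saturating response of GPP to absorbed light."""
    return gpp_max * par / (par + half_saturation)


def compare_supports(fine_inputs):
    """Return (model run on the mean input, mean of fine-scale runs)."""
    mean_input = sum(fine_inputs) / len(fine_inputs)
    coarse_run = light_response(mean_input)
    fine_runs = [light_response(value) for value in fine_inputs]
    return coarse_run, sum(fine_runs) / len(fine_runs)


# Four hypothetical sub-cells inside one coarse grid cell.
coarse_run, fine_mean = compare_supports([100.0, 300.0, 900.0, 1500.0])
```

Because the assumed response is concave, the coarse run exceeds the
mean of the fine runs. A parameter calibrated at one support would
therefore carry a bias when applied at another, even with the model
structure unchanged.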

III. Creating a multi-model ensemble with the
Terrestrial Observation and Prediction System 

A. TOPS 

We have been constructing an MME using the Terrestrial
Observation and Prediction System (TOPS) to evaluate
sources of uncertainty in carbon flux estimates resulting from
structural differences among ecosystem models. TOPS is a
data and modeling system for ecological monitoring,
forecasting, and related ecosystem analyses
(Nemani et al., 2009). TOPS brings together meteorological
records, satellite products, and various ancillary datasets 
from different sources. One of its key components is the 
Surface Observation and Gridding System (SOGS), which 
ingests daily observations of temperature, precipitation and 
other fields from meteorological stations and interpolates 
them to complete grids (Thornton et al., 1997; Jolly et al., 
2005). 

B. Input variable datasets 

Our study covers the entire North American continent at
8 km resolution for the period 1982 to 2006. We used SOGS to
generate daily meteorological fields. The source data for
these variables were obtained from the Global Summary of the
Day (GSOD) and the Cooperative Summary of the Day (TD3200)
from the National Climatic Data Center (NCDC). GSOD is a
global dataset based on data exchanged under the World
Meteorological Organization World Weather Watch Program.
GSOD has
about 2000 reporting stations over North America, which is 
relatively sparse for interpolating climate variables such as 
daily precipitation over the whole continent. For this reason, 
we added the TD3200 network, which consists of about 8500 
reporting stations in the US, primarily from the National 
Weather Service cooperative station network and principal 
climatological stations, significantly increasing the density of 
stations over the US. 

For models requiring satellite-derived vegetation data, we
used the leaf area index (LAI) dataset developed by Ganguly
et al. (2008). The MODIS land cover product (Friedl et al.,
2002) was used to describe the distribution of plant functional
types (PFTs). A global dataset of land surface parameters,
ECOCLIMAP (Masson et al., 2003), was used to specify soil
properties (e.g., texture and depth) and other parameters
(e.g., albedo).

C. Ecosystem models 

The architecture of TOPS provides a flexible interface
for integrating ecosystem models. The ensemble comprised
public versions of four process-based ecosystem models:
Biome-BGC, LPJ, CASA, and TOPS-BGC (Table I). Details of the
application of these models are given in Wang et al.
(unpublished).


TABLE I. Selected characteristics and attributes of models used in the MME.

                     Biome-BGC               LPJ                     TOPS-BGC                CASA
Type                 prognostic              prognostic              diagnostic              diagnostic
Land cover           prescribed              simulated               prescribed              prescribed
LAI                  simulated               simulated               prescribed              prescribed
Carbon pools         vegetation and soil     vegetation and soil     none                    vegetation and soil
GPP(a) algorithm     Farquhar (1980)         Farquhar (1980)         LUE                     2 x NPP
NPP(b) algorithm     GPP - AR(c)             GPP - AR                0.5 x GPP               LUE
NEE(d)/HR(e)         NPP - HR; HR estimated  NPP - HR; HR estimated  NPP - HR; HR estimated  NPP - HR; HR estimated
  algorithm          from soil pools         from soil pools         from base respiration   from soil pools
                                                                     rates
Dynamic N(f) cycle   yes                     no                      no                      no
Reference            Thornton et al., 2002   Sitch et al., 2003      Nemani et al., 2009     Potter et al., 1993

(a) Gross Primary Production; (b) Net Primary Production; (c) Autotrophic respiration;
(d) Net Ecosystem Exchange; (e) Heterotrophic respiration; (f) Nitrogen


IV. Results and Discussion 

Model outputs include monthly carbon fluxes (e.g., GPP, 
NPP, NEE, AR) and annual averaged carbon stocks (biomass 
and soil carbon pools). Global summaries as well as 
geographic patterns of all of these outputs show considerable 
ranges. For example, the GPP fields from the four models
(Figure 1) agree on some regions of highs and lows but differ
substantially in other regions. Though four models are
too few to accurately estimate ensemble summary statistics,
some quantification of uncertainty is possible by examining
the ranges spanned by model results.

A natural way to make summary statistics of future MMEs
more robust is to increase the number of models represented.
Given the multitude of sources of uncertainty (Section II),
however, it might make more sense to address the range of
unknowns in a more systematic fashion. For example, model
hierarchies could provide increased understanding (Wang et
al., 2009). Held (2005) suggests such an approach for climate
models, which face similar challenges.

Flux tower measurements provide the most useful
reference data (Baldocchi, 2003) for MME results, but the
available sample of tower data is extremely sparse and not
representative of the population of values, and its
uncertainties are not fully characterized. Challenges to
validation will persist for the foreseeable future.

V. Conclusions 

MMEs are computationally expensive and time-consuming
to construct and analyze. However, scientific
knowledge is not fully captured by a single model. Our
experiment with a high spatial resolution MME indicates 
there is large structural uncertainty in simulating carbon 
fluxes for North America, equal to or exceeding uncertainty 
due to input variable prediction or parameter estimation. 
Though it might be reasonable to suppose that a larger 
research effort should reduce scientific uncertainty, this 
study and other recent work (Chen et al., 2008; Mitchell et 
al., 2009) suggest the contrary. As the sources of
uncertainty are increasingly acknowledged and quantified, 
the probability distribution of output values will become 
wider rather than narrower. 